Semi-automatic Parsing for Web Knowledge Extraction through Semantic Annotation

Author

  • Maria Pia di Buono
Abstract

Parsing Web information, that is, parsing content to find documents relevant to a user’s query, is a crucial step in guaranteeing fast and accurate Information Retrieval (IR). An automated approach to this task is generally considered faster and cheaper than manual systems. Nevertheless, its results do not seem to reach a high level of accuracy; as Hjorland (2007) also states, using stochastic algorithms entails low precision, low recall and generic results. IR systems are usually based on an inverted text index, namely an index data structure that stores a mapping from content to its locations in a database file, a document or a set of documents. In this paper we propose a system by means of which we will develop a search engine able to process online documents, starting from a natural language query, and to return information to users. The proposed approach, based on the Lexicon-Grammar (LG) framework and its language formalization methodologies, aims at integrating a semantic annotation process for both query analysis and document retrieval.
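
The inverted text index mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the system proposed in the paper, which relies on the Lexicon-Grammar framework. The tokenizer, the function names (`tokenize`, `build_inverted_index`, `search`) and the toy documents are assumptions made here for illustration only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a naive, assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each token to {doc_id: [token positions]} given a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(position)
    return index


def search(index, query):
    """Return the sorted ids of the documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # Each posting set holds the ids of the documents containing one query token.
    postings = [set(index.get(token, {})) for token in tokens]
    return sorted(set.intersection(*postings))


# Hypothetical toy collection, invented for illustration only.
documents = {
    "d1": "Semantic annotation of web documents",
    "d2": "Parsing web content for information retrieval",
    "d3": "Semantic web ontologies",
}
index = build_inverted_index(documents)
```

Calling `search(index, "semantic web")` intersects the posting sets of the two tokens, so only documents containing both words are returned. A purely lexical lookup of this kind matches surface forms only, which is the limitation that a semantic annotation step is meant to address.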

Similar resources

Linguistic Annotation for the Semantic Web

Establishing the semantic web on a large scale implies the widespread annotation of web documents with ontology-based knowledge markup. For this purpose, tools have been developed that allow for semi-automatic annotation of web documents with ontology-based metadata. However, given that a large number of web documents consist either fully or at least partially of free text, language technology ...

S-CREAM: Semiautomatic CREAtion of Metadata

Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, S-CREAM, that allows for creation of metadata and is trainable for a specific domain. Annotating web documents is one of the major techniques for creating metadata on the web. The implementation of S-CREAM, OntoMat, now supports the semi-automatic annotation of web pages. This semi-a...

Technical Report: Semantic Annotation Platforms

Semantic annotation is a key component for the realization of the Semantic Web. The volume of existing and new documents on the Web makes manual annotation problematic. Semi-automatic methods have been designed to alleviate the burden, and these methods have begun to be implemented with Semantic Annotation Platforms (SAPs). SAPs provide services that support annotation, including ontologies, kn...

Further use of Controlled Natural Language for Semantic Annotation of Wikis

Knowledge Acquisition through Semantic Annotation is vital to the evolution, growth and success of the Semantic Web. Both Semiautomatic and Manual Annotation are constricted by a knowledge acquisition bottleneck. Manual Semantic Annotation is a complex and arduous task both time-consuming and costly, often requiring specialist annotators. Therefore, automation of this process is essential to ea...

Semantator: A Semi-automatic Semantic Annotation Tool for Clinical Narratives

In this paper, we introduce Semantator, a semi-automatic tool for document annotation with Semantic Web ontologies. With a loaded free text document and an ontology, users can annotate document fragments with classes in the ontology to create instances and relate created instances with ontology properties. Also, Semantator enables automatic annotation by connecting to the NCBO annotator and the...

Publication date: 2016